[Draft] Support the globaltimer and smid on Intel Arch #4816

Draft
wants to merge 1 commit into
base: main

Conversation

chengjunlu
Contributor

Support the globaltimer and smid on Intel Arch.

Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds support for globaltimer and smid functions on Intel Architecture (XPU) by implementing Intel-specific inline assembly code and updating corresponding unit tests.

  • Replaces CUDA PTX assembly with Intel inline assembly for globaltimer and smid functions
  • Adds bitwise operations to extract subslice ID from status register for smid implementation
  • Updates unit tests to support both CUDA and Intel XPU backends with appropriate assembly verification
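The subslice ID extraction listed above comes down to a mask and a shift on the 32-bit status register value. The sketch below reuses the two constants visible in the diff (`pos = 9` and `subslice_mask = (1 << 11) - 1`). The excerpt does not show how the PR combines them, so the mask-then-shift order is an assumption, and `extract_subslice_id` is a hypothetical name used only for illustration.

```python
# Constants as they appear in the diff under review.
POS = 9                        # shift amount (`pos` in the PR)
SUBSLICE_MASK = (1 << 11) - 1  # keeps bits 0..10 (`subslice_mask` in the PR)


def extract_subslice_id(sr: int) -> int:
    """Hypothetical helper: mask the status word, then shift the field down.

    Assumption: the mask is applied before the shift, which leaves
    bits 9..10 of the register value as the result.
    """
    return (sr & SUBSLICE_MASK) >> POS


if __name__ == "__main__":
    # Field value 2 in bits 9..10, with unrelated bits set around it.
    sample = (0b10 << 9) | 0x1FF | (1 << 12)
    print(extract_subslice_id(sample))
```

If the PR instead shifts first and masks afterwards, the extracted field would be wider, which is one reason naming and documenting both constants helps.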

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| third_party/intel/language/intel/utils.py | Implements Intel-specific inline assembly for globaltimer and smid functions |
| python/test/unit/language/test_core.py | Updates tests to support both CUDA and Intel XPU with backend-specific assertions |
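As a rough illustration of what a backend-specific assertion could look like, the sketch below picks the text to search for in a kernel's assembly dump based on the backend name. The helper name, the backend keys, and the choice of markers are assumptions made for this example; they are not taken from the PR's test changes. `%globaltimer` is the PTX special register, and `%tsc` is the register read by the Intel inline assembly in this PR.

```python
# Hypothetical mapping from backend name to a marker that a test might
# expect to find in the compiled kernel's assembly text.
EXPECTED_TIMER_MARKER = {
    "cuda": "%globaltimer",  # PTX special register
    "xpu": "%tsc",           # register read by the Intel inline asm
}


def timer_marker_present(backend: str, asm_text: str) -> bool:
    """Return True if the backend's expected timer marker occurs in asm_text."""
    try:
        marker = EXPECTED_TIMER_MARKER[backend]
    except KeyError:
        raise ValueError(f"unsupported backend: {backend!r}") from None
    return marker in asm_text
```

A test could then assert `timer_marker_present(backend, asm_text)` once it has obtained the assembly text for the active backend.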

Comment on lines +6 to +9
return core.inline_asm_elementwise(
"""{\n .decl globaltimer v_type=G type=ud num_elts=2 align=qword alias=<$0, 0> \n"""
""" mov (M1_NM, 2) globaltimer(0, 0)<1> %tsc(0,0)<1;1,0> \n}""", "=rw.u", [], dtype=core.uint64, is_pure=False,
pack=1, _semantic=_semantic)

Copilot AI Jul 31, 2025

[nitpick] The inline assembly string is complex and split across multiple lines with embedded newlines and escape sequences. Consider extracting this assembly code to a constant or using a more readable multiline string format to improve maintainability.

Suggested change
- return core.inline_asm_elementwise(
- """{\n .decl globaltimer v_type=G type=ud num_elts=2 align=qword alias=<$0, 0> \n"""
- """ mov (M1_NM, 2) globaltimer(0, 0)<1> %tsc(0,0)<1;1,0> \n}""", "=rw.u", [], dtype=core.uint64, is_pure=False,
- pack=1, _semantic=_semantic)
+ GLOBALTIMER_ASM = (
+     "{\n"
+     " .decl globaltimer v_type=G type=ud num_elts=2 align=qword alias=<$0, 0> \n"
+     " mov (M1_NM, 2) globaltimer(0, 0)<1> %tsc(0,0)<1;1,0> \n"
+     "}"
+ )
+ return core.inline_asm_elementwise(
+     GLOBALTIMER_ASM, "=rw.u", [], dtype=core.uint64, is_pure=False, pack=1, _semantic=_semantic
+ )
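In the assembly above, the `.decl` line declares a two-element unsigned-dword view (`type=ud num_elts=2`) aliased onto the 64-bit result `$0`, so a single two-element `mov` from `%tsc` fills the whole `uint64`. The host-side sketch below shows the same reinterpretation in plain Python. It assumes a little-endian layout in which element 0 is the low half, and `combine_dwords` is a hypothetical name used only for this example.

```python
import struct


def combine_dwords(lo: int, hi: int) -> int:
    """Reinterpret two uint32 values as one little-endian uint64.

    Assumption: element 0 of the dword view is the low 32 bits.
    """
    raw = struct.pack("<2I", lo, hi)     # two unsigned 32-bit values
    (value,) = struct.unpack("<Q", raw)  # same 8 bytes read as one uint64
    return value
```

The result equals `lo | (hi << 32)`; the `struct` round trip is used only to mirror the aliasing that the declaration expresses.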

Copilot uses AI. Check for mistakes.

_semantic=_semantic)
sr = core.inline_asm_elementwise("mov (M1_NM, 1) $0(0, 0)<1> %sr0(0,0)<0;1,0>", "=rw.u", [], dtype=core.uint32,
is_pure=True, pack=1, _semantic=_semantic)
pos: core.constexpr = core.constexpr(9)

Copilot AI Jul 31, 2025

The magic number 9 should be documented or extracted to a named constant to explain what bit position it represents in the status register.

Suggested change
- pos: core.constexpr = core.constexpr(9)
+ pos: core.constexpr = core.constexpr(STATUS_REGISTER_BIT_POSITION)

sr = core.inline_asm_elementwise("mov (M1_NM, 1) $0(0, 0)<1> %sr0(0,0)<0;1,0>", "=rw.u", [], dtype=core.uint32,
is_pure=True, pack=1, _semantic=_semantic)
pos: core.constexpr = core.constexpr(9)
subslice_mask: core.constexpr = core.constexpr((1 << 11) - 1)

Copilot AI Jul 31, 2025

The magic number 11 should be documented or extracted to a named constant to explain the bit width of the subslice mask.

Suggested change
- subslice_mask: core.constexpr = core.constexpr((1 << 11) - 1)
+ SUBSLICE_MASK_BIT_WIDTH = 11 # Bit width of the subslice mask
+ subslice_mask: core.constexpr = core.constexpr((1 << SUBSLICE_MASK_BIT_WIDTH) - 1)

@etiotto etiotto marked this pull request as draft July 31, 2025 13:55
Signed-off-by: Lu,Chengjun <[email protected]>

inline asm for smid and global timer.